A speech understanding module for a multimodal mathematical formula editor
Authors
Abstract
As part of a framework for a multimodal mathematical formula editor that will support natural speech and handwriting interaction, a single-stage speech understanding module is presented. It is based on a multilevel statistical, expectation-driven approach. Completely spoken, realistic formulas containing basic arithmetic operations, roots, indexed sums, integrals, trigonometric functions, logarithms, convolutions, Fourier transforms, exponentiations, and indexing (among others) were examined. The speaker-specific and formula-specific structural recognition accuracies reach up to 90 % and 100 %, respectively. For visualization and postprocessing purposes, a transformation into Adobe FrameMaker documents is performed. An advanced variant of this architecture will further be used as the basis for a multimodal semantic decoder that combines script and speech analysis. It will include a so-called Multimodal Probabilistic Grammar, which will be trained via multimodal usability tests.
Similar resources
Integrating HMM-Based Speech Recognition With Direct Manipulation In A Multimodal Korean Natural Language Interface
This paper presents an HMM-based speech recognition engine and its integration into direct manipulation interfaces for a Korean document editor. Speech recognition can reduce the tedious and repetitive actions that are inevitable in standard GUIs (graphical user interfaces). Our system consists of a general speech recognition engine called ABrain 1 and a speech-commandable document editor called SH...
Speech Technology at Home: Enhanced Interfaces for People with Disabilities
This paper presents new advances in speech technology carried out by the Speech Technology Group (GTH) at the Universidad Politécnica de Madrid (UPM) to develop enhanced interfaces at home. These interfaces provide a better interaction for people with disabilities. The speech recognizer includes a speaker identification feature (that makes an acoustic adaptation possible for improving recogniti...
Collaborative Design Studio: a sketch-based environment to support rich distant collaboration
An increasing number of large-scale projects require distant teams to collaborate remotely. At the same time, current CAD tools offer only minimal support for partial and asynchronous interactions. The application we propose enables fully synchronous, remote, sketch-based collaborative design. This setup is a combination of a virtual desktop (a remote meeting table), a standard...
Speech Signal Enhancement Using Firefly Optimization Algorithm
Speech signal enhancement is essential for obtaining a clean speech signal from a noisy one. For multimodal optimization, nature-inspired algorithms such as the Firefly Algorithm (FA) are better suited. The proposed algorithm contains a preprocessing module, an optimization module, and a spectral filtering module. Here, Loizou’s and Aurora databases are considered for signals. In this paper the Perceptional E...
A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas
We present a novel approach towards a multimodal analysis of natural speech and handwriting input for entering mathematical expressions into a computer. It utilizes an integrated, multilevel probabilistic architecture with a joint semantic and two distinct syntactic models describing speech and script properties, respectively. Compared to classical multistage solutions our single-stage strategy...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000